# Fine-grained Local Description
DAM 3B Self Contained
Other
DAM-3B is a vision-language model that generates fine-grained local descriptions of user-specified image regions (points, boxes, sketches, or masks).
Image-to-Text, English
nvidia
824
17
DAM 3B
Other
DAM-3B is a 3-billion-parameter vision-language model capable of generating fine-grained local descriptions for user-specified image regions.
Image-to-Text, Safetensors, English
nvidia
1,417
81